home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
HPAVC
/
HPAVC CD-ROM.iso
/
3DGPL.ZIP
/
3DGPL
/
TEXT
/
3.TXT
< prev
next >
Wrap
Text File
|
1995-06-22
|
49KB
|
1,179 lines
2-D rasterization.
--------------------
Re graphics: A picture is worth 10K words -
but only those to describe the picture.
Hardly any sets of 10K words can be adequately
described with pictures.
Rasterization, the term itself refers to the technology we are using to
represent images: turning it into mosaic of small square pixel cells set
to different colours. The algorithms performing just that are covered in
this section.
A dot.
------
Rasterizing dot is very straightforward, I am not even sure it is
warranted to call it rasterization at all. All it involves is just setting
some location in the screen memory to particular colour. Let's recall
that the very first location of the colourmap describes top rightmost
pixel, that determines the system of references which is convenient in
this case with X axis directed to the right and Y axis directed downward.
HW_X_SIZE
(0,0)+-+-+-+- ... -+-------> X
| | | |
+-+-+ +
|
HW_Y_SIZE ...
|
+-+- +-+
| | |*|<------y
+-+-+ +-+
| ^
| |
V |
Y x
pic 3.1
Assuming that each pixel is represented by one byte, and that the dimensions
of the output surface is HW_SCREEN_X_SIZE by HW_SCREEN_Y_SIZE it is the
y*HW_SCREEN_X_SIZE+x location we have to access in order to plot a dot
at (x,y). Here's some code to do just that:
unsigned char *G_buffer; /* colourmap- offscreen buffer */
void G_dot(int x,int y,unsigned char colour)
{
G_buffer[y*HW_SCREEN_X_SIZE+x]=colour;
}
A line.
-------
Line rasterization is a very important peace of code for the library. It
would also be used for polygon rendering, and few other routines. What we
have to do is set to some specified colour all dots on the line's path.
We can easily calculate coordinates of all dots belonging to this line:
* (xend,yend) ---
/ |
* (x,y) dy
/ |
/ |y |
(xstart,ystart) *-----+ ---------
x
|
| |
|- -dx- -|
pic 3.2
Remembering a little bit of high (high?) school geometry, it can be
seen that if:
dx = xend-xstart
dy = yend-ystart
then:
x dx x * dy
--- = ---- and y = --------
y dy dx
Since we want to set all pixels along the path we just have to take
all Xs in the range of [xstart, xend] and calculate respective Y. There
is a bit of a catch, we would have only as much as xend-xstart
calculated coordinates. But if the Y range was actually longer the path
we have just found won't be continuous pic 3.3. In this case we should have
been calculating coordinates taking values from the longer range
[ystart, yend] and getting respective X, pic 3.4 as:
y * dx
x = --------
dy
....* ....*
..... ....*
...*. ...*.
..... ...*.
..*.. yrange ..*.. yrange
..... ..*..
.*... .*...
..... .*...
*.... *....
xrange xrange
pic 3.3 pic 3.4
What we just did is quite logical and easily programmable algorithm,
but... it is not that commonly used, the reason has to do with the way
we did the calculation, it involved division, and that is by far the
slowest operation any computer does. What we would do instead doesn't
involve a single division. It is called a Bresenham's integer based
algorithm (Bresenham, 1965) and here we find all points using few
logical operations and integer addition. The idea is to find a bigger
range among [xstart,xend] and [ystart,yend] go along points in it and
have a variable saying when it is time to advance in the other range.
Let's consider what is happening at some stage in line rasterization,
let's assume X is a longer range (dx>dy) and that with the line we are
rendering xstart<xend and ystart<yend:
|
| | H(x+1,y+1)
-+---|---*-
| |h/
| | | * I(x+1,real_y)
| /|
+ - - - - + - - - - + -
| / |
| | / | |l |
/ |
-*---|---*-L(x+1,y)
|P(x,y) |
pic 3.5
Pic 3.5 shows the situation where we just rendered a pixel P (Previous) at
coordinates (x,y) and now have to make a decision where to go to next, to
pixel L(x+1,y) (Lower) or H(x+1,y+1) (Higher). points P,H,L actually represent
middles of respective pixels. Since actual line would pass somewhere in the
middle between those two points we must plot the one whose centre is closer
to the intersection point I(x+1,real_y). this can be measured by comparing
segments h and l, resulted from the intersection: remembering dependency
used in the previous method we can find that:
dy * (x+1)
real_y = ------------
dx
and:
dy * (x+1)
h = y+1-real_y => h = y+1 - ------------
dx
dy * (x+1)
l = real_y-y => l = ------------ - y
dx
Now, we are interested in comparing l and h:
dy * (x+1)
l - h = 2 ------------ - 2y-1
dx
So, if l-h>0 it means that l>h the intersection point L is closer to point
H and the later should be rendered, otherwise L should be rendered. let's
multiply both sides by dx:
dx(l - h)= 2dyx + 2dy - 2dxy - dx
Now, since dx assumed bigger then 0, signs of (l-h) and dx(l-h) would be
the same. Let's call dx(l-h) as d and find it's sign and value at some step
i and next step i+1:
d = 2dyx -2dxy + 2dy - dx
i i-1 i-1
d = 2dyx -2dxy + 2dy - dx
i+1 i i
We can also find the initial value d, in the very first point since before
that x==0 and y==0:
d = 2dy-dx
1
-------------
Since sign of d determines which point to move next (H or L) lets find
value of d at some step i assuming we know it's value at i-1 from
previous calculations:
d - d = 2dyx - 2dxy - 2dyx + 2dxy
i+1 i i i i-1 i-1
d - d = 2dy(x - x ) - 2dx(y - y )
i+1 i i i-1 i i-1
Depending on the decision taken in the previous point value of d in the
next one would be either:
when H was chosen since x - x =1 and y - y =1
i i-1 i i-1
d - d = 2dy - 2dx
i+1 i
d = d + 2dy - 2dx
i+1 i
---------------------
or:
when L was chosen since x - x =1 and y - y =0
i i-1 i i-1
d -d = 2dy
i+1 i
d =d + 2dy
i+1 i
--------------
This may seem a little complicated but the implementation code certainly
doesn't look this way. We just have to find the initial value of d, and
then in the loop depending on it's sign add to it either 2dy-2dx, or
just 2dy. since those are constants, they can be calculated before the
actual loop. In the proof above we have assumed xstart<yend ystrat<yend,
however, in real life we can't guarantee that. The way we can take care
of it, is just changing increments to decrements when above assumptions
don't hold. We also have to remember that in the loop when L point is
picked it is only value belonging to the bigger range which is incremented
(decremented) whereas when picking H point both X and Y have to be changed
since that is when we are advancing along both axes.
void G_line(int x,int y,int x2,int y2,unsigned char colour)
{
int dx,dy,long_d,short_d;
int d,add_dh,add_dl;
register int inc_xh,inc_yh,inc_xl,inc_yl;
register int i;
dx=x2-x; dy=y2-y; /* ranges */
if(dx<0){dx=-dx; inc_xh=-1; inc_xl=-1;} /* making sure dx and dy >0 */
else { inc_xh=1; inc_xl=1; } /* adjusting increments */
if(dy<0){dy=-dy; inc_yh=-1; inc_yl=-1;}
else { inc_yh=1; inc_yl=1; }
if(dx>dy){long_d=dx; short_d=dy; inc_yl=0;}/* long range,&making sure either */
else {long_d=dy; short_d=dx; inc_xl=0;}/* x or y is changed in L case */
d=2*short_d-long_d; /* initial value of d */
add_dl=2*short_d; /* d adjustment for H case */
add_dh=2*short_d-2*long_d; /* d adjustment for L case */
for(i=0;i<=long_d;i++) /* for all points in longer range */
{
G_dot(x,y,colour); /* rendering */
if(d>=0){x+=inc_xh; y+=inc_yh; d+=add_dh;}/* previous point was H type */
else {x+=inc_xl; y+=inc_yl; d+=add_dl;}/* previous point was L type */
}
}
As you can see this algorithm doesn't involve divisions at all. It does
have multiplication by two, but almost any modern compiler would optimize
it to 1 bit left shift - a very cheap operation. Or if we don't have faith
in the compiler we can do it ourselves. Lets optimize this code a little
bit. Whenever trying to optimize something we first have to go after
performance of the code in the loop since it is something being done
over and over. In this source above we can see the function call to G_dot,
let's remember what is inside there,.... oh, it is y*HW_SCREEN_X_SIZE+x a
multiplication... an operation similar in length to division which
we just spent so much effort to avoid. Lets think back to the
representation of the colourmap, if we just rendered a point and
now want to render another one next to it how their addresses in the
colourmap would be different? pic 3.6
+-+-+-
A|*->|B
+|+-+-
C|V| |
+-+-
| |
pic 3.6
Well, if it is along X axis that we want to advance we just have to add 1,
to the address of pixel A to get to pixel B, if we want to advance along
Y we have to add to the address of A number of bytes in the colourmap
separating A and C. this number is exactly HW_SCREEN_X_SIZE that is length
of one horizontal line of pixels in memory. Now, using that, let's try
instead of coordinates (x,y) to calculate it's address in the colourmap:
void G_line(int x,int y,int x2,int y2,unsigned char colour)
{
int dx,dy,long_d,short_d;
int d,add_dh,add_dl;
int inc_xh,inc_yh,inc_xl,inc_yl;
register int inc_ah,inc_al;
register int i;
register unsigned char *adr=G_buffer;
dx=x2-x; dy=y2-y; /* ranges */
if(dx<0){dx=-dx; inc_xh=-1; inc_xl=-1;} /* making sure dx and dy >0 */
else { inc_xh=1; inc_xl=1; } /* adjusting increments */
if(dy<0){dy=-dy; inc_yh=-HW_SCREEN_X_SIZE;
inc_yl=-HW_SCREEN_X_SIZE;}/* to get to the neighboring */
else { inc_yh= HW_SCREEN_X_SIZE; /* point along Y have to add */
inc_yl= HW_SCREEN_X_SIZE;}/* or subtract this */
if(dx>dy){long_d=dx; short_d=dy; inc_yl=0;}/* long range,&making sure either */
else {long_d=dy; short_d=dx; inc_xl=0;}/* x or y is changed in L case */
inc_ah=inc_xh+inc_yh;
inc_al=inc_xl+inc_yl; /* increments for point adress */
adr+=y*HW_SCREEN_X_SIZE+x; /* address of first point */
d=2*short_d-long_d; /* initial value of d */
add_dl=2*short_d; /* d adjustment for H case */
add_dh=2*short_d-2*long_d; /* d adjustment for L case */
for(i=0;i<=long_d;i++) /* for all points in longer range */
{
*adr=colour; /* rendering */
if(d>=0){adr+=inc_ah; d+=add_dh;} /* previous point was H type */
else {adr+=inc_al; d+=add_dl;} /* previous point was L type */
}
}
We can see from this code that although the complexity of the preliminary
part of the algorithm went up a bit, the code inside the loop is much shorter
and simpler. There is just one thing left to add in order to make it all
completely usable - clipping that is making sure our drawing doesn't go
outside the physical boundaries of the colourmap, but this is covered in
the next chapter.
Ambient polygons
----------------
What we have to do is to fill with some colour inside the polygon's edges.
The easiest and perhaps fastest way to do it is by using "scan line"
algorithm. The idea behind it is to convert a polygon into a collection
of usually horizontal lines and then render them one at a time. The lines
have to be horizontal only because the most common colourmap would have
horizontal lines contingent in memory which automatically allows to render
them faster then vertical lines since increment by 1 is usually faster
then addition. There is one inherited limitation to the algorithm, because
at each Y heights there would be just one horizontal line only concave
polygons can be rendered correctly. It can be seen that non concave polygons
might need two or more lines sometimes pic 3.7 (that's by the way, in case
you wondered, determines the difference in definitions between concave and
non-concave polygons):
*\
| \ /*
y|---\ /-|
| \/ |
*--------*
pic 3.7
How can we represent the collection of horizontal lines? obviously we
just need to know at which Y heights it is, X coordinate of the point
where it starts and another X coordinate of the point where it ends.
So lets have one "start" and one "end" location for every Y possible
on the screen. Since every line is limited by two points each belonging
to certain edge, we can take one edge at a time find all points belonging
to it and for each of them change the "start" and "end" at particular Y
heights. So that at the end we would have coordinates of all scan lines
in the polygon pic 3.8:
start end
+-----+---+ 1 2 3 4 5 6 7 8
| 1 | 5 |- - - - - -> *------\
| 2 | 8 | \------------*
| 2 | 7 |- - - - - - -> \----------/
| 3 | 7 | \------/
| 4 | 6 |- - - - - - - -> *----*
+-----+---+
pic 3.8
Initially we would put the biggest possible value in all locations of
"start" array and the smallest possible in the locations of "end" array.
(Those are by the way are defined in <limits.h> and it is not a bad
practice to use them). Each time we've calculated the point on the edge
we have to go to "start" and "end" locations at Y heights and if X of
the point less then what we have in "start" write it there. Similar to
that if this same X is bigger then value in "end" location we have to
put it there too. Because of the way the arrays were initialized first
point calculated for every Y height would change both "start" and "end"
location. Let's look at the code to find Xs for each edge, so called
scanning function:
void GI_scan(int *edges,int dimension,int skip)
{
signed_32_bit cur_v[C_MAX_DIMENSIONS]; /* initial values for Z+ dims */
signed_32_bit inc_v[C_MAX_DIMENSIONS]; /* increment for Z+ dimensions */
signed_32_bit tmp;
int dx,dy,long_d,short_d;
register int d,add_dh,add_dl;
register int inc_xh,inc_yh,inc_xl,inc_yl;
int x,y,i,j;
int *v1,*v2; /* first and second verteces */
v1=edges; edges+=skip; v2=edges; /* length ints in each */
if(C_line_y_clipping(&v1,&v2,dimension)) /* vertical clipping */
{
dx=*v2++; dy=*v2++; /* extracting 2-D coordinates */
x=*v1++; y=*v1++; /* v2/v1 point remaining dim-2 */
dimension-=2;
if(y<G_miny) G_miny=y;
if(y>G_maxy) G_maxy=y;
if(dy<G_miny) G_miny=dy;
if(dy>G_maxy) G_maxy=dy; /* updating vertical size */
dx-=x; dy-=y; /* ranges */
if(dx<0){dx=-dx; inc_xh=-1; inc_xl=-1;} /* making sure dx and dy >0 */
else { inc_xh=1; inc_xl=1; } /* adjusting increments */
if(dy<0){dy=-dy; inc_yh=-1; inc_yl=-1;}
else { inc_yh=1; inc_yl=1; }
if(dx>dy){long_d=dx;short_d=dy;inc_yl=0;} /* long range,&making sure either */
else {long_d=dy;short_d=dx;inc_xl=0;} /* x or y is changed in L case */
d=2*short_d-long_d; /* initial value of d */
add_dl=2*short_d; /* d adjustment for H case */
add_dh=2*short_d-2*long_d; /* d adjustment for L case */
for(i=0;i<dimension;i++) /* for all remaining dimensions */
{
tmp=v1[i]; tmp<<=G_P; cur_v[i]=tmp; /* f.p. start value */
tmp=v2[i]-v1[i]; tmp<<=G_P; /* f.p. increment */
if(long_d>0) inc_v[i]=tmp/long_d; /* if needed (non 0 lines) */
}
for(i=0;i<=long_d;i++) /* for all points in longer range */
{
if(x<G_start[y]) /* further then rightmost */
{
G_start[y]=x; /* the begining of scan line */
for(j=0;j<dimension;j++)
G_rest_start[j][y]=cur_v[j]; /* all other dimensions */
}
if(G_end[y]<x) /* further the leftmost */
{
G_end[y]=x; /* the end of scan line */
for(j=0;j<dimension;j++)
G_rest_end[j][y]=cur_v[j]; /* and for other dimension */
}
if(d>=0){x+=inc_xh;y+=inc_yh;d+=add_dh;} /* previous dot was H type */
else {x+=inc_xl;y+=inc_yl;d+=add_dl;} /* previous dot was L type */
for(j=0;j<dimension;j++)
cur_v[j]+=inc_v[j]; /* for all other dimensions */
}
}
}
As you can see this edge scanning function doesn't have much differences
with our most basic line rendering code. The only thing we are updating the
vertical size of polygon so that after all edges would go through this
function we would have the correct vertical dimensions in G_miny and
G_maxy. There is another difference: scan function is designed to process,
other dimensions, that is it would interpolate X and similar to that
other values specified in vertices, finding them for all Ys in range.
We would need that when adding interpolative shading feature. As to the
ambient polygon rendering, After scanning is over we would just have
to render all horizontal scan lines in the range [G_miny,G_maxy]:
void G_ambient_polygon(int *edges,int length,unsigned char colour)
{
int new_edges[G_MAX_TUPLES]; /* verticaly clipped polygon */
int new_length; /* although no edges there yet */
register unsigned char *l_adr,*adr;
register int beg,end,i;
GI_boarder_array_init(); /* initializing the arrays */
new_length=C_polygon_x_clipping(edges,new_edges,2,length);
for(i=0;i<new_length;i++)
GI_scan(&new_edges[i*2],2,2); /* Searching polygon boarders */
if(G_miny<=G_maxy) /* nothing to do? */
{
l_adr=G_buffer+G_miny*HW_SCREEN_X_SIZE; /* addr of 1st byte of 1st line */
for(; G_miny<=G_maxy; G_miny++,
l_adr+=HW_SCREEN_X_SIZE) /* rendering all lines */
{
adr=l_adr+(beg=G_start[G_miny]); /* addr of the current line */
end=G_end[G_miny]; /* ends here */
for(;beg<=end;beg++) *adr++=colour; /* rendering single line */
}
}
}
As to the optimization it doesn't appear we can do a lot without
leaving comfortable boundaries of C. On the other hand when there's
really nothing else we can do to speed things up, when all manuals
are read and all friends are out of ideas let's do it, let's go
for assembly. I don't think rewriting all functions this way is
appealing. Moreover only rewriting the real execution bottleneck
would give considerable speed increase. That's, perhaps, one of
the reason why modern compilers have such a nice feature as inline
assembly, allowing to add low-level code directly into C source.
Looking at the code above we can see this line rendering loop. This is
an obvious candidate, since it executes fairly often and does quite a
lot of iterations, especially compared with surrounding code.
Different C compilers have different provision for inline assembly.
GNU C compiler has quite powerful syntax allowing to specify assembly
code. It is:
asm("assembly_command %1,%0":"=constraint" (result)
:" constraint" (argument),
" constraint" (argument),
...
:" clobbered_reg", "clobbered_reg" ...
);
The assembly instruction is specified with aliases allowing
to add names of C variables into it. it also allows to specify
the CPU registers the command, we are inserting, might clobber.
It is particularly important if we are expecting to compile with
optimizing option or have register variables in the code. We
don't want the program trying to use contents of a register we
just destroyed by inline assembly instruction. If we are specifying
the clobbered registers, on the other hand, compiler would make
sure that no useful information is in this registers at the
moment inline assembly code starts to execute.
WATCOM C compiler has different provision for inline assembly:
return_type c_function(parameter1,parametr2, ...);
#pragma aux c_fucntion= \
"assembly_command" \
"assembly_command" \
parm [reg] [reg] \
value [reg] \
modify [reg reg];
It is implemented by providing sort of a pseudo c-function which
is treated by the compiler pretty much like a macro. "parm" specifies
which registers take in parameters from function's argument list,
"value"- from what register, if any, the return value is taken and
"modify", specifies clobbered registers.
for DJGPP compiler code substituting line rendering loop:
for(;beg<=end;beg++) *adr++=colour; /* rendering single line */
may look like:
asm("movl %0,%%ecx" :: "g" (end):"%ecx");
asm("subl %0,%%ecx" :: "g" (beg):"%ecx");
asm("movl %0,%%edi" :: "g" (adr):"%edi");
asm("movl %0,%%eax" :: "g" (colour):"%eax");
asm("cld"); /* increment when stosb */
asm("rep");
asm("stosb %al,(%edi)"); /* rendering line here */
for WATCOM:
void a_loop(unsigned char *adr,int end,int beg,int colour);
#pragma aux a_loop= \
"sub ecx,esi" \
"cld" \
"rep stosb" \
parm [edi] [ecx] [esi] [al];
Not a terribly complicated code to write but something giving
considerable speed increase. Then again we might have used a C
library function memset for filling chunks of memory:
void *memset(void *s,int c,size_t n);
memset(l_adr,colour,end-beg); /* filling scan line */
This function might have had just that code, we wrote, inside of
it. let's not believe assumptions neither pro nor con and go check
that out. with DJGPP or, in this matter, most UNIX compilers we can
just extract and disassemble memset code from libc.a (standard
c library) and see for ourselves. let's go:
ar -x libc.a memset.o
objdump -d memset.o
We would see something like:
memset.o: file format coff-go32
Disassembly of section .text:
00000000 <_memset> pushl %ebp
00000001 <_memset+1> movl %esp,%ebp
00000003 <_memset+3> pushl %edi
00000004 <_memset+4> movl 0x8(%ebp),%edi
00000007 <_memset+7> movl 0xc(%ebp),%eax
0000000a <_memset+a> movl 0x10(%ebp),%ecx
0000000d <_memset+d> jcxz 00000011 <_memset+11>
0000000f <_memset+f> repz stosb %al,%es:(%edi)
00000011 <_memset+11> popl %edi
00000012 <_memset+12> movl 0x8(%ebp),%eax
00000015 <_memset+15> leave
00000016 <_memset+16> ret
00000017 <_memset+17> Address 0x18 is out of bounds.
Disassembly of section .data:
Very similar to what we wrote? Other compilers might have
worse library code? it might be so, let's try WATCOM C,
let's go:
wlib clibs.lib *memset
wdisasm memset.obj
that's what we would see:
Module: memset
Group: 'DGROUP' CONST,CONST2,_DATA,_BSS
Segment: '_TEXT' BYTE USE32 00000023 bytes
0000 57 memset push edi
0001 55 push ebp
0002 89 e5 mov ebp,esp
0004 8b 45 10 mov eax,+10H[ebp]
0007 8b 4d 14 mov ecx,+14H[ebp]
000a 8b 7d 0c mov edi,+0cH[ebp]
000d 06 push es
000e 1e push ds
000f 07 pop es
0010 57 push edi
0011 88 c4 mov ah,al
0013 d1 e9 shr ecx,1
0015 f2 66 ab repne stosw
0018 11 c9 adc ecx,ecx
001a f2 aa repne stosb
001c 5f pop edi
001d 07 pop es
001e 89 f8 mov eax,edi
0020 c9 leave
0021 5f pop edi
0022 c3 ret
No disassembly errors
------------------------------------------------------------
Is it not very similar to what we just did using inline assembly?
WATCOM C code is even trying to be smarter then us by storing as
much as it can by word store instruction, since every word store
in this case is equivalent to two char stores, there should be
twice less iterations. Of course there would be some time spent
on the function call itself but most likely performance of the
inline assembly code we wrote and memset call would be very very
similar. It is just yet another example of how easy it might be
in some cases to achieve results by very simple means instead of
spending sleepless hours in front of the glowing box (unfortunately
only sleepless hours teach us how to do things by simple means but
that's the topic for another story only marginally dedicated to
3-D graphics).
By the way, in the provided source code there is an independent
approach to fast memory moves and fills. That is functions to perform
those are brought out into hardware interface as: HW_set_char ;
HW_set_int; HW_copy_char; HW_copy_int. And their implementation
may actually be either of the ways described above depending on
the particular platform.
Shaded polygons.
----------------
Shaded polygons are used to smooth appearance of faceted models,
that is those where we approximate real, curved surfaces by a
set of plane polygons. Interpolative or Gouraud shading (Gouraud, 1971),
we would discuss, is the easiest to implement, it is also not very
expensive in terms of speed. The idea is that we carry a colour
intensity value in every vertex of the polygon and linearly interpolate
those values to find colour for each pixel inside the polygon.
Finally this is the place where implemented N-dimensional scanning
procedure comes in handy. Indeed colour intensity can be pretty
much handled as just an extra dimension as far as edge scanning
and clipping are concerned.
*I3
/ \
/ \
I0* \
\ *I2
\ I /
IT1*---*---*IT2
\ /
\ /
\ /
*I1
pic 3.9
We can obtain values on the left and right boarders of a polygon (IT1,
IT2 on pic 3.9) by interpolating colour intensity value in every edge
during edge scanning, just as "start/end' X values were found for every
horizontal line. Afterwards when rendering certain horizontal scan line we
can further interpolate "start" and "end" intensities for this line finding
colour for pixels belonging to it (I on pic 3.9). Fixed point math is
probably a very convenient way to implement this interpolation:
void G_shaded_polygon(int *edges,int length)
{
int new_edges[G_MAX_SHADED_TUPLES]; /* verticaly clipped polygon */
int new_length;
register unsigned char *l_adr,*adr;
register int beg,end,i;
register signed_32_bit cur_c,inc_c; /* current colour and it's inc */
GI_boarder_array_init(); /* initializing the array */
new_length=C_polygon_x_clipping(edges,new_edges,3,length);
for(i=0;i<new_length;i++)
GI_scan(&new_edges[i*3],3,3); /* Searching polygon boarders */
if(G_miny<=G_maxy) /* nothing to do? */
{
l_adr=G_buffer+G_miny*HW_SCREEN_X_SIZE; /* addr of 1st byte of 1st line */
for(; G_miny<=G_maxy; G_miny++,
l_adr+=HW_SCREEN_X_SIZE) /* rendering all lines */
{
adr=l_adr+(beg=G_start[G_miny]); /* addr of the current line */
end=G_end[G_miny]; /* ends here */
cur_c=G_0_start[G_miny]; /* colour starts with this value */
inc_c=G_0_end[G_miny]-cur_c;
if(end>beg) inc_c/=end-beg+1; /* f.p. increment */
for(;beg<=end;beg++) /* render this line */
{
*adr++=cur_c>>G_P; /* rendering single point */
cur_c+=inc_c; /* incrementing colour */
}
}
}
}
As you see code above looks not that much different from ambient polygon
rendering source.
There is one small catch. I said that intensities can be handled pretty
much as an extra dimension. In fact we can consider shaded polygon on
the screen to take 2 space dimensions, and to have one colour dimension.
But would all vertexes of this polygon belong to one plane in such 3-D space?
If they would not, shading would look rather weird especially when this
polygon would start to rotate. The common solution is limit polygons to just
triangles. Why? well three points always belong to the same plane in
3-D space.
Textured polygons.
------------------
It might seem that we can extend interpolation technique used to
calculate colour intensity for shading, onto texture rendering. Just
keep texture X and Y in every vertex of the polygon, interpolate
those two along edges and then along horizontal lines obtaining
this way texture (X,Y) for every pixel to be rendered. This however
does not work... Or to be more exact it stops working when perspective
projection is used, the reason is simple: perspective transformation
is not linear.
One may ask, why did it work for interpolative shading, after all
we were using perspective transformation all along? The answer is
it did not work, but visual aspect of neglecting perspective effect
for shading is quite suitable, neglecting it for texture rendering
however is not. I believe it has to do with different image frequencies.
Just consider the following situation: a rectangular polygon is
displayed on the screen so that it's one side is much closer
to us then the other one, which is almost disappearing in the
infinity: where do you think in the texture map lines A,B,C and D
would be mapped from by linear interpolating method? how do you think
the texture on the polygon would look like?
+\ +--/---/+1
A|--\ A|/ / |
B|----\ B|/ |
C|----/ C|\ <------ Missed area
D|--/ D|\ \ |
+/ +--\---\+2
Polygon on the Texture
Screen Map
pic 3.10
Well it would look like if someone has carved triangular area
marked "missed" on the picture above, and connected points 1 and
2 together, real ugly... In less dramatic cases texture rendered this
way might look nicer, being perfect when effect of perspective
transformation is negligible: all points are at almost the same
distance from the viewer, or very far away from the viewing plane.
To get always nice looking results we would have to consider
what is really happening when texture is being rendered:
u^ Texture Space
| *T(u,v)
+---->
o v
y ^
| z
| U \ /*W(x,y,z)/ V World Space
| / \ /
| / \ /
|/ * O
+--------------> x
j^
|
| *S(i,j) Screen Space
|
+------->i
pic 3.11
Consider three spaces pictured above: the texture space, world space (the
one before projection) and finally contents of the screen - screen or image
space. We may think of them just as original polygon/texture map in case
of texture space, polygon/texture map turned somehow in case of world
space and finally same projected onto the screen.
If we know mapping of origin and two orthogonal vectors from texture
space into the world space, we can express mapping of any point through
them:
x = Ox + v*Vx + u*Ux
y = Oy + v*Vy + u*Uy
z = Oz + v*Vz + u*Uz
Where Vx...Uz are corresponding components of the respective vectors.
Further point in the world space W(x,y,z) is being perspectively
transformed into the screen space S(i,j):
i = x / z
j = y / z
(the actual transformation used, would most likely also contain a
multiplier - focus, but let's worry about particularities on some later
stage). In order to perform mapping, on the other hand, we need u,v
texture coordinates as function of screen coordinates i,j
x = i * z
y = i * z
or:
Ox + v*Vx + u*Ux = i * [ Oz + v*Vz + u*Uz ]
Oy + v*Vy + u*Uy = j * [ Oz + v*Vz + u*Uz ]
trying to express u,v through i,j:
v*(Vx-i*Vz) + u*(Ux-i*Uz) = i*Oz - Ox
v*(Vy-j*Vz) + u*(Uy-j*Uz) = j*Oz - Oy
further:
(i*Oz - Ox) - u*(Ux - i*Uz)
v = -----------------------------
(Vx - i*Vz)
(i*Oz - Ox) - v*(Vx - i*Vz)
u = -----------------------------
(Ux - i*Uz)
and:
(i*Oz - Ox) - u*(Ux - i*Uz)
----------------------------- * (Vy - j*Vz) + u*(Uy - j*Uz) = j*Oz - Oy
(Vx - i*Vz)
(i*Oz - Ox) - v*(Vx - i*Vz)
v*(Vy - j*Vz) + ----------------------------- * (Uy - j*Uz) = j*Oz - Oy
(Ux - i*Uz)
or in nicer form:
i*(Vz*Oy-Vy*Oz) + j*(Vx*Oz-Vz*Ox) + (Vy*Ox-Vx*Oy)
u = --------------------------------------------------
i*(Vy*Uz-Vz*Uy) + j*(Vz*Ux-Vx*Uz) + (Vx*Uy-Vy*Ux)
i*(Uy*Oz-Uz*Oy) + j*(Uz*Ox-Ux*Oz) + (Ux*Oy-Uy*Ox)
v = --------------------------------------------------
i*(Vy*Uz-Vz*Uy) + j*(Vz*Ux-Vx*Uz) + (Vx*Uy-Vy*Ux)
At this point we have formulas of from where in the texture space the
point in the screen space originates. There are several implementational
problems, first few are simple, the actual perspective transformation
is i=x*focus/z rather then i=x/z, plus, we are performing screen centre
translation so that screen space (0,0) is in the middle of the screen
and not in the upper left corner. Hence original dependency should have
been:
i = (x * focus) / z + HW_SCREEN_X_CENTRE
j = (y * focus) / z + HW_SCREEN_Y_CENTRE
And so finally, so called, reverse mapping formulas have to be amended
respectively:
((i-HW_SCREEN_X_CENTRE)/focus) * (Vz* ...
u = -------------------------------------------
...
...
Another, a bit difficult part has to do with texture scaling.
If the size of texture vectors was picked to be 1 that would
assume no scaling, that is unit size along the polygon in the
world space would correspond to unit size in the texture space.
What we however want to do in a lot of instances is to map
smaller texture onto a bigger polygon or the other way around,
bigger texture onto a smaller polygon. Let's try scaling the
texture space:
T
*---------->
v| * | S
| | \ *--------------------->
| | \ | v'| * | v T
| *-- ---* | | \ ----- = -----
V- - - - - | | | \ | v' S
| | \
| |mapping \ | v'*T
| |of the \ v = -----
| |polygon \ | S
| *---------------*
V- - - - - - - - - - -|
pic 3.12
now we can put expression for v into direct mapping equations:
Vx*T Ux*T
x = Ox + v'* ------ + u'* ------
S S
...
Vx and Ux multiplied by T in fact are just projections of vectors
enclosing the hole texture space, let's call them V'x == Vx*T,
U'x == Ux*T etc.
V'x U'x
x = Ox + v'* ----- + u'* -----
S S
...
considering this reverse mapping equations would look like:
((i-HW_SCREEN_X_CENTRE)/focus) * (V'z* ...
u'= S * --------------------------------------------
...
...
This would allow us to scale texture on the polygon by changing S
multiplier.
Another problem has to do with point O - mapping of the origin of the
texture space into the world space. The thing is, if the mapping of this
point gets onto Z==0 plane mapping equations above no longer hold. Just
due to the fact that perspective transformation is having singularity
there. In general we deal with this problem of perspective transformation
by clipping the polygon against some plane in front of Z==0. And do we
really need mapping of exactly O of the texture space? Not necessarily,
it may be any point in texture space assuming we change a bit direct
mapping equations:
x' = Ox + (v'-v'0)*V'x + (u-u0)*U'x
...
where (v'0,u'0) are coordinates in the texture space of the point we
have a mapping for in the world space that is (Ox,Oy,Oz). And the
reverse mapping equations would be then:
((i-HW_SCREEN_X_CENTRE)/focus) * (V'z* ...
u'= S * ------------------------------------------- + u'0
...
...
How can we obtain a mapping of, now, any point from the texture
space into the world space? If we would associate with every
polygon's vertex besides just x,y,z also u,v of where this vertex
is in the texture, we can treat that as 5-D polygon and perform
clipping on it, obtaining vertexes with coordinates suitable
for perspective transformation, hence any of them would be fine
for the mapping equations. (How to do clipping on polygons is
covered in the next chapter).
Last thing, in order to render regular patterns, those repeating
themselves, we may want to introduce texture size parameter,
and make sure that if we want to access texture outside this
size, this access would wrap around to pick a colour from somewhere
inside the texture. This can be easily done if texture size
is chosen to be some power of 2, for in this case we can
perform range checking with just logical "and" operation.
However, using above equations for texture mapping appear to be
quite expensive, there are few divisions involved per each rendered
pixel. How to optimize that? The, perhaps, easiest way is: horizontal
line subdivision, that is calculating real texture mapping every N pixels
and linearly interpolating in between, this method is used in the
implementation below:
void G_textured_polygon(int *edges,int length,int *O,int *u,int *v,
unsigned char *texture,int log_texture_size,
int log_texture_scale
)
{
int new_edges[G_MAX_SHADED_TUPLES]; /* verticaly clipped polygon */
int new_length;
signed_32_bit Vx,Vy,Vz;
signed_32_bit Ux,Uy,Uz; /* extracting vectors */
signed_32_bit x0,y0,z0;
signed_32_bit ui,uj,uc;
signed_32_bit vi,vj,vc;
signed_32_bit zi,zj,zc; /* back to texture coeficients */
signed_32_bit v0,u0;
signed_32_bit xx,yy,zz,zzz;
int xstart,xend;
signed_32_bit txstart,tystart;
signed_32_bit txend,tyend;
signed_32_bit txinc,tyinc;
register unsigned char *g_adr,*adr;
register int i,x,y;
signed_32_bit txtrmasc=(0x1<<(log_texture_size+G_P))-0x1;
log_texture_scale+=G_P;
GI_boarder_array_init(); /* initializing the array */
new_length=C_polygon_x_clipping(edges,new_edges,4,length);
for(i=0;i<new_length;i++)
GI_scan(&new_edges[i*4],2,4); /* Searching polygon boarders */
if(G_miny<=G_maxy) /* nothing to do? */
{
x0=O[0]; y0=O[1]; z0=O[2];
u0=O[3]<<G_P; v0=O[4]<<G_P; /* world point <-> texture point */
Vx=v[0]; Vy=v[1]; Vz=v[2];
Ux=u[0]; Uy=u[1]; Uz=u[2]; /* extracting vectors */
ui=(Vz*y0)-(Vy*z0);
uj=(Vx*z0)-(Vz*x0);
uc=(Vy*x0)-(Vx*y0);
vi=(Uy*z0)-(Uz*y0);
vj=(Uz*x0)-(Ux*z0);
vc=(Ux*y0)-(Uy*x0);
zi=(Vy*Uz)-(Vz*Uy);
zj=(Vz*Ux)-(Vx*Uz);
zc=(Vx*Uy)-(Vy*Ux); /* back to texture */
g_adr=G_buffer+G_miny*HW_SCREEN_X_SIZE; /* addr of 1st byte of 1st line */
for(; G_miny<=G_maxy; G_miny++,
g_adr+=HW_SCREEN_X_SIZE) /* rendering all lines */
{
xstart=G_start[G_miny];
adr=g_adr+xstart;
xstart-=HW_SCREEN_X_CENTRE;
x=xstart;
xend=G_end[G_miny]-HW_SCREEN_X_CENTRE;
y=G_miny-HW_SCREEN_Y_CENTRE;
xx=((y*uj)>>T_LOG_FOCUS)+uc;
yy=((y*vj)>>T_LOG_FOCUS)+vc;
zz=((y*zj)>>T_LOG_FOCUS)+zc; /* valid for the hole run */
if(( zzz=zz+((x*zi)>>T_LOG_FOCUS) )!=0)
{
txend=( (xx+ ((x*ui)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +u0;
tyend=( (yy+ ((x*vi)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +v0;
}
for(;xstart<xend;)
{
x+=G_LINEAR; if(x>xend) x=xend; /* size of linear run */
txstart=txend;
tystart=tyend;
if(( zzz=zz+((x*zi)>>T_LOG_FOCUS) )!=0)
{
txend=( (xx+ ((x*ui)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +u0;
tyend=( (yy+ ((x*vi)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +v0;
}
length=x-xstart; /* ends here */
if(length!=0) { txinc=(txend-txstart)/length;
tyinc=(tyend-tystart)/length;
}
for(;xstart<x;xstart++) /* linear run */
{
txstart&=txtrmasc; tystart&=txtrmasc; /* wrap around */
*adr++=*(texture+((tystart>>G_P)<<log_texture_size)+(txstart>>G_P));
txstart+=txinc; tystart+=tyinc; /* new position in the texture */
}
}
}
}
}
Other possible speed-up techniques are: area subdivision involving
2-D interpolation, faking texture mapping with polynomials (later
has not very much to do with the true mapping equations described
here, however) and rendering texture along constant z lines (and
not along horizontal line) the advantage gained by the former is
that along every constant Z line perspective transformation is
linear ( perspective transformation is i=x/z if z==const we
can write it as i=coef*x where coef=1/z which is linear of course)
The function above might be extended with interpolative shading
as well. To do that special consideration has to be given to
palette's layout, that is where and how to keep different intensities
of the same colour. But once decided coding is quite straightforward.
* * *